A Proposal for Incremental Dialogue Evaluation

Authors

  • Madeleine Bates
  • Damaris M. Ayuso
Abstract

The SLS community has made progress recently toward evaluating SLS systems that deal with dialogue, but considerable work remains to be done in this area. Our goal is to develop incremental ways to evaluate dialogue processing, not just going from Class D1 (dialogue pairs) to Class D2 (dialogue triples), but measuring aspects of dialogue processing other than length. We present two suggestions: one for extending the common evaluation procedures for dialogues, and one for modifying the scoring metric.

INTRODUCTION

There is no single dialogue problem. By its nature, dialogue processing is composed of many different capabilities matched to many different aspects of the problem. It is reasonable to expect that dialogue evaluation methodologies should be multifaceted to reflect this richness of structure. Ideally, each new addition to the set of evaluation methodologies should test a different aspect of dialogue processing, and should be harder than the methodologies that came before it. We present two suggestions: one which extends the common evaluation procedure in order to test one new aspect of dialogues, and one which modifies the scoring metric.

An Analogy with Chess

We as a community have been thinking about dialogue evaluation in terms of whether the systems we are building give the "right" answer (the one the wizard gave, or the one agreed upon by the Principles of Interpretation) at every step. We have been trying to come up with a methodology to measure whether our systems can reproduce the wizard's answers at each step of a lengthy dialogue. But is this a reasonable approach? Participating in a dialogue, whether between two humans or between a human and a machine, bears a striking resemblance to playing a complex game such as chess.

Differences:

1. Conversation is cooperative, but a game is competitive.
2. In chess, the goal is clear (checkmate), but in a conversational dialogue, the goal is less clear.
3. In a chess game, any state can be completely and concisely represented by a single board position; in a dialogue it is not known what comprises a state, nor how to represent it.

Like the game tree for chess, the human/computer dialogue tree is enormous, as indicated in Figure 1. There are usually hundreds or thousands of alternatives the human may produce. The number of responses the system can make is much smaller; some responses may be clearly wrong, but seldom is there a single "right" or "best" response (just as there is seldom a single such move in chess). Even when striving for the same goal, two different people are very likely to choose very different paths.
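The size of the human/computer dialogue tree mentioned in the abstract can be illustrated with a rough calculation. The sketch below is not from the paper: the function name and the branching factors (500 user alternatives, 5 system responses) are hypothetical values chosen only to show how quickly the number of distinct dialogue paths grows with dialogue length.

```python
def dialogue_tree_paths(user_branching: int, system_branching: int, exchanges: int) -> int:
    """Count distinct paths through a simplified dialogue tree.

    One exchange is a user utterance followed by a system response, so each
    exchange multiplies the number of paths by
    user_branching * system_branching.
    """
    if user_branching < 1 or system_branching < 1 or exchanges < 0:
        raise ValueError("branching factors must be >= 1 and exchanges >= 0")
    return (user_branching * system_branching) ** exchanges


# Hypothetical branching factors, for illustration only (not from the paper).
USER_ALTERNATIVES = 500   # "hundreds or thousands" of things the human may say
SYSTEM_RESPONSES = 5      # a much smaller set of plausible system responses

for n_exchanges in range(1, 4):
    paths = dialogue_tree_paths(USER_ALTERNATIVES, SYSTEM_RESPONSES, n_exchanges)
    print(f"{n_exchanges} exchange(s): {paths:,} paths")
```

Under these assumed numbers, three exchanges already yield more than fifteen billion paths, which suggests why matching a single reference answer at every step of a lengthy dialogue is a demanding criterion.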


Similar resources

Collaborating on Utterances with a Spoken Dialogue System Using an ISU-based Approach to Incremental Dialogue Management

When dialogue systems, through the use of incremental processing, are not bounded anymore by strict, nonoverlapping turn-taking, a whole range of additional interactional devices becomes available. We explore the use of one such device, trial intonation. We elaborate our approach to dialogue management in incremental systems, based on the Information-State-Update approach, and discuss an implem...


The INPROTK 2012 Release: A Toolkit for Incremental Spoken Dialogue Processing

We describe the 2012 release of INPROTK, our “Incremental Processing Toolkit” which combines a powerful and extensible architecture for incremental processing with components for incremental speech recognition and, new to this release, incremental speech synthesis. These components work domain-independently; we also provide example implementations of higher-level components such as natural langu...


Toward incremental dialogue act segmentation in fast-paced interactive dialogue systems

In this paper, we present and evaluate an approach to incremental dialogue act (DA) segmentation and classification. Our approach utilizes prosodic, lexico-syntactic and contextual features, and achieves an encouraging level of performance in offline corpus-based evaluation as well as in simulated human-agent dialogues. Our approach uses a pipeline of sequential processing steps, and we investi...


Development and Usability Evaluation of an Online Tutorial for “How to Write a Proposal” for Medical Sciences Students

Background and Objective: Considering the importance of learning how to write a proposal for students, this study was performed to develop an online tutorial for “How to write a Proposal” for students and to evaluate its usability. Methods: This study is developmental research with tool design. The “Gamified Online Tutorial based on Self-Determination Theory (GOT-STD) Framework” became the basis f...


Incremental Dialogue Processing in a Micro-Domain

This paper describes a fully incremental dialogue system that can engage in dialogues in a simple domain, number dictation. Because it uses incremental speech recognition and prosodic analysis, the system can give rapid feedback as the user is speaking, with a very short latency of around 200ms. Because it uses incremental speech synthesis and self-monitoring, the system can react to feedback f...



Publication date: 1991